Overview

Dataset statistics

Number of variables: 40
Number of observations: 260601
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 81.5 MiB
Average record size in memory: 328.0 B
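
The total size and the average record size are two views of the same measurement, so one can be checked against the other. A minimal sketch, using only the figures reported above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the Dataset statistics table.
n_observations = 260_601
avg_record_bytes = 328.0

# Total memory is the average record size times the number of records.
total_bytes = n_observations * avg_record_bytes
total_mib = total_bytes / 2**20  # 1 MiB = 1,048,576 bytes

print(f"{total_mib:.1f} MiB")  # 81.5 MiB
```

The result agrees with the reported total to the one decimal place shown.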

Variable types

Numeric: 9
Categorical: 31

Alerts

count_floors_pre_eq is highly overall correlated with height_percentage and 1 other field [High correlation]
height_percentage is highly overall correlated with count_floors_pre_eq [High correlation]
foundation_type is highly overall correlated with roof_type and 4 other fields [High correlation]
roof_type is highly overall correlated with foundation_type and 1 other field [High correlation]
ground_floor_type is highly overall correlated with has_superstructure_cement_mortar_brick [High correlation]
other_floor_type is highly overall correlated with count_floors_pre_eq and 1 other field [High correlation]
has_superstructure_mud_mortar_stone is highly overall correlated with foundation_type [High correlation]
has_superstructure_cement_mortar_brick is highly overall correlated with foundation_type and 1 other field [High correlation]
has_superstructure_rc_non_engineered is highly overall correlated with foundation_type [High correlation]
has_superstructure_rc_engineered is highly overall correlated with foundation_type [High correlation]
has_secondary_use is highly overall correlated with has_secondary_use_agriculture and 1 other field [High correlation]
has_secondary_use_agriculture is highly overall correlated with has_secondary_use [High correlation]
has_secondary_use_hotel is highly overall correlated with has_secondary_use [High correlation]
land_surface_condition is highly imbalanced (51.3%) [Imbalance]
foundation_type is highly imbalanced (60.9%) [Imbalance]
ground_floor_type is highly imbalanced (59.3%) [Imbalance]
position is highly imbalanced (50.4%) [Imbalance]
plan_configuration is highly imbalanced (90.7%) [Imbalance]
has_superstructure_adobe_mud is highly imbalanced (56.8%) [Imbalance]
has_superstructure_stone_flag is highly imbalanced (78.4%) [Imbalance]
has_superstructure_cement_mortar_stone is highly imbalanced (86.9%) [Imbalance]
has_superstructure_mud_mortar_brick is highly imbalanced (64.1%) [Imbalance]
has_superstructure_cement_mortar_brick is highly imbalanced (61.5%) [Imbalance]
has_superstructure_bamboo is highly imbalanced (58.0%) [Imbalance]
has_superstructure_rc_non_engineered is highly imbalanced (74.6%) [Imbalance]
has_superstructure_rc_engineered is highly imbalanced (88.2%) [Imbalance]
has_superstructure_other is highly imbalanced (88.8%) [Imbalance]
legal_ownership_status is highly imbalanced (86.0%) [Imbalance]
has_secondary_use_agriculture is highly imbalanced (65.5%) [Imbalance]
has_secondary_use_hotel is highly imbalanced (78.8%) [Imbalance]
has_secondary_use_rental is highly imbalanced (93.2%) [Imbalance]
has_secondary_use_institution is highly imbalanced (98.9%) [Imbalance]
has_secondary_use_school is highly imbalanced (99.5%) [Imbalance]
has_secondary_use_industry is highly imbalanced (98.8%) [Imbalance]
has_secondary_use_health_post is highly imbalanced (99.7%) [Imbalance]
has_secondary_use_gov_office is highly imbalanced (99.8%) [Imbalance]
has_secondary_use_use_police is highly imbalanced (99.9%) [Imbalance]
has_secondary_use_other is highly imbalanced (95.4%) [Imbalance]
building_id has unique values [Unique]
geo_level_1_id has 4011 (1.5%) zeros [Zeros]
age has 26041 (10.0%) zeros [Zeros]
count_families has 20862 (8.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-04-23 12:00:12.711197
Analysis finished: 2023-04-23 12:02:33.336593
Duration: 2 minutes and 20.63 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
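
The reported duration is simply the difference between the two timestamps. A small sketch that recomputes it with the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-04-23 12:00:12.711197")
finished = datetime.fromisoformat("2023-04-23 12:02:33.336593")

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```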

Variables

building_id
Real number (ℝ)

Distinct: 260601
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525675.48
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 MiB

Quantile statistics

Minimum: 4
5-th percentile: 52114
Q1: 261190
median: 525757
Q3: 789762
95-th percentile: 1000724
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 528572

Descriptive statistics

Standard deviation: 304545
Coefficient of variation (CV): 0.57934031
Kurtosis: -1.203879
Mean: 525675.48
Median Absolute Deviation (MAD): 264277
Skewness: 0.0018823567
Sum: 1.3699156 × 10^11
Variance: 9.2747656 × 10^10
Monotonicity: Not monotonic
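
Several of the building_id figures above are derived from others, so they can be cross-checked with plain arithmetic. The sketch below recomputes the range, IQR, CV, variance and sum from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative, and because the inputs are the rounded values shown in the report, the last digits may differ slightly.

```python
# Reported (rounded) figures for building_id.
n = 260_601
minimum, maximum = 4, 1_052_934
q1, q3 = 261_190, 789_762
mean, std = 525_675.48, 304_545

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n                  # Sum is the mean times the number of rows

print(value_range, iqr)
print(f"CV={cv:.5f}  variance={variance:.4e}  sum={total:.4e}")
```

The recomputed values agree with the reported Range, IQR, CV, Variance and Sum up to rounding.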
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
802906                 1       < 0.1%
680296                 1       < 0.1%
802531                 1       < 0.1%
544902                 1       < 0.1%
823257                 1       < 0.1%
373540                 1       < 0.1%
627590                 1       < 0.1%
421951                 1       < 0.1%
241191                 1       < 0.1%
1024699                1       < 0.1%
Other values (260591)  260591  > 99.9%

Value  Count  Frequency (%)
4      1      < 0.1%
8      1      < 0.1%
12     1      < 0.1%
16     1      < 0.1%
17     1      < 0.1%
25     1      < 0.1%
28     1      < 0.1%
31     1      < 0.1%
34     1      < 0.1%
36     1      < 0.1%

Value    Count  Frequency (%)
1052934  1      < 0.1%
1052931  1      < 0.1%
1052929  1      < 0.1%
1052926  1      < 0.1%
1052921  1      < 0.1%
1052915  1      < 0.1%
1052911  1      < 0.1%
1052909  1      < 0.1%
1052908  1      < 0.1%
1052906  1      < 0.1%

geo_level_1_id
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.900353
Minimum: 0
Maximum: 30
Zeros: 4011
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.0336166
Coefficient of variation (CV): 0.57794334
Kurtosis: -1.2132488
Mean: 13.900353
Median Absolute Deviation (MAD): 6
Skewness: 0.27253035
Sum: 3622446
Variance: 64.538996
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value              Count  Frequency (%)
6                  24381  9.4%
26                 22615  8.7%
10                 22079  8.5%
17                 21813  8.4%
8                  19080  7.3%
7                  18994  7.3%
20                 17216  6.6%
21                 14889  5.7%
4                  14568  5.6%
27                 12532  4.8%
Other values (21)  72434  27.8%

Value  Count  Frequency (%)
0      4011   1.5%
1      2701   1.0%
2      931    0.4%
3      7540   2.9%
4      14568  5.6%
5      2690   1.0%
6      24381  9.4%
7      18994  7.3%
8      19080  7.3%
9      3958   1.5%

Value  Count  Frequency (%)
30     2686   1.0%
29     396    0.2%
28     265    0.1%
27     12532  4.8%
26     22615  8.7%
25     5624   2.2%
24     1310   0.5%
23     1121   0.4%
22     6252   2.4%
21     14889  5.7%

geo_level_2_id
Real number (ℝ)

Distinct: 1414
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701.07469
Minimum: 0
Maximum: 1427
Zeros: 38
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
median: 702
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.71073
Coefficient of variation (CV): 0.58868298
Kurtosis: -1.1882325
Mean: 701.07469
Median Absolute Deviation (MAD): 349
Skewness: 0.028957381
Sum: 1.8270076 × 10^8
Variance: 170330.15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
39                   4038    1.5%
158                  2520    1.0%
181                  2080    0.8%
1387                 2040    0.8%
157                  1897    0.7%
363                  1760    0.7%
463                  1740    0.7%
673                  1704    0.7%
533                  1684    0.6%
883                  1626    0.6%
Other values (1404)  239512  91.9%

Value  Count  Frequency (%)
0      38     < 0.1%
1      204    0.1%
3      77     < 0.1%
4      315    0.1%
5      25     < 0.1%
6      2      < 0.1%
7      100    < 0.1%
8      120    < 0.1%
9      333    0.1%
10     354    0.1%

Value  Count  Frequency (%)
1427   6      < 0.1%
1426   286    0.1%
1425   466    0.2%
1424   7      < 0.1%
1423   3      < 0.1%
1422   216    0.1%
1421   254    0.1%
1420   10     < 0.1%
1419   95     < 0.1%
1418   152    0.1%

geo_level_3_id
Real number (ℝ)

Distinct: 11595
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6257.8761
Minimum: 0
Maximum: 12567
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 611
Q1: 3073
median: 6270
Q3: 9412
95-th percentile: 11927
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6339

Descriptive statistics

Standard deviation: 3646.3696
Coefficient of variation (CV): 0.58268485
Kurtosis: -1.2138965
Mean: 6257.8761
Median Absolute Deviation (MAD): 3171
Skewness: 0.00039351209
Sum: 1.6308088 × 10^9
Variance: 13296012
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
633                   651     0.2%
9133                  647     0.2%
621                   530     0.2%
11246                 470     0.2%
2005                  466     0.2%
11440                 455     0.2%
7723                  443     0.2%
9229                  381     0.1%
2452                  349     0.1%
12258                 312     0.1%
Other values (11585)  255897  98.2%

Value  Count  Frequency (%)
0      2      < 0.1%
1      6      < 0.1%
3      9      < 0.1%
5      14     < 0.1%
6      21     < 0.1%
7      2      < 0.1%
8      31     < 0.1%
9      3      < 0.1%
10     1      < 0.1%
11     62     < 0.1%

Value  Count  Frequency (%)
12567  1      < 0.1%
12565  7      < 0.1%
12564  6      < 0.1%
12563  24     < 0.1%
12562  3      < 0.1%
12561  19     < 0.1%
12560  17     < 0.1%
12559  6      < 0.1%
12558  6      < 0.1%
12557  44     < 0.1%

count_floors_pre_eq
Real number (ℝ)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1297232
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.72766455
Coefficient of variation (CV): 0.34167095
Kurtosis: 2.3225979
Mean: 2.1297232
Median Absolute Deviation (MAD): 0
Skewness: 0.83411296
Sum: 555008
Variance: 0.52949569
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count   Frequency (%)
2      156623  60.1%
3      55617   21.3%
1      40441   15.5%
4      5424    2.1%
5      2246    0.9%
6      209     0.1%
7      39      < 0.1%
8      1       < 0.1%
9      1       < 0.1%

Value  Count   Frequency (%)
1      40441   15.5%
2      156623  60.1%
3      55617   21.3%
4      5424    2.1%
5      2246    0.9%
6      209     0.1%
7      39      < 0.1%
8      1       < 0.1%
9      1       < 0.1%

Value  Count   Frequency (%)
9      1       < 0.1%
8      1       < 0.1%
7      39      < 0.1%
6      209     0.1%
5      2246    0.9%
4      5424    2.1%
3      55617   21.3%
2      156623  60.1%
1      40441   15.5%
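
Because count_floors_pre_eq takes only nine distinct values, its frequency table is enough to recover the summary statistics reported for it. The sketch below rebuilds the sum, mean, median and variance from the value counts alone. Dividing by n - 1 (the sample variance) is an assumption about how the report computes variance, and the names used are illustrative.

```python
# Value -> count, copied from the count_floors_pre_eq frequency table.
counts = {1: 40441, 2: 156623, 3: 55617, 4: 5424, 5: 2246,
          6: 209, 7: 39, 8: 1, 9: 1}

n = sum(counts.values())                        # number of observations
total = sum(v * c for v, c in counts.items())   # sum of all values
mean = total / n

# Sample variance (ddof=1): weighted squared deviations over n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)

def median_from_counts(counts):
    """Return the middle value; this simple form assumes an odd row count."""
    middle_rank = (sum(counts.values()) + 1) // 2
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if seen >= middle_rank:
            return value

median = median_from_counts(counts)
print(n, total, round(mean, 7), median, round(variance, 5))
```

The counts add up to the 260601 observations and the values to the reported Sum of 555008, which is a quick way to confirm that the frequency table is complete.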

age
Real number (ℝ)

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.535029
Minimum: 0
Maximum: 995
Zeros: 26041
Zeros (%): 10.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
median: 15
Q3: 30
95-th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 73.565937
Coefficient of variation (CV): 2.7724084
Kurtosis: 157.24824
Mean: 26.535029
Median Absolute Deviation (MAD): 10
Skewness: 12.192494
Sum: 6915055
Variance: 5411.947
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)
Value              Count  Frequency (%)
10                 38896  14.9%
15                 36010  13.8%
5                  33697  12.9%
20                 32182  12.3%
0                  26041  10.0%
25                 24366  9.3%
30                 18028  6.9%
35                 10710  4.1%
40                 10559  4.1%
50                 7257   2.8%
Other values (32)  22855  8.8%

Value  Count  Frequency (%)
0      26041  10.0%
5      33697  12.9%
10     38896  14.9%
15     36010  13.8%
20     32182  12.3%
25     24366  9.3%
30     18028  6.9%
35     10710  4.1%
40     10559  4.1%
45     4711   1.8%

Value  Count  Frequency (%)
995    1390   0.5%
200    106    < 0.1%
195    2      < 0.1%
190    3      < 0.1%
185    1      < 0.1%
180    7      < 0.1%
175    5      < 0.1%
170    6      < 0.1%
165    2      < 0.1%
160    6      < 0.1%

area_percentage
Real number (ℝ)

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.0180506
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.3922309
Coefficient of variation (CV): 0.54779287
Kurtosis: 30.438258
Mean: 8.0180506
Median Absolute Deviation (MAD): 2
Skewness: 3.5260823
Sum: 2089512
Variance: 19.291693
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
6                  42013  16.1%
7                  36752  14.1%
5                  32724  12.6%
8                  28445  10.9%
9                  22199  8.5%
4                  19236  7.4%
10                 15613  6.0%
11                 13907  5.3%
3                  11837  4.5%
12                 7581   2.9%
Other values (74)  30294  11.6%

Value  Count  Frequency (%)
1      90     < 0.1%
2      3181   1.2%
3      11837  4.5%
4      19236  7.4%
5      32724  12.6%
6      42013  16.1%
7      36752  14.1%
8      28445  10.9%
9      22199  8.5%
10     15613  6.0%

Value  Count  Frequency (%)
100    1      < 0.1%
96     3      < 0.1%
90     1      < 0.1%
86     5      < 0.1%
85     4      < 0.1%
84     3      < 0.1%
83     3      < 0.1%
82     1      < 0.1%
80     1      < 0.1%
78     1      < 0.1%

height_percentage
Real number (ℝ)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4343652
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.9184182
Coefficient of variation (CV): 0.35301607
Kurtosis: 14.318526
Mean: 5.4343652
Median Absolute Deviation (MAD): 1
Skewness: 1.8082618
Sum: 1416201
Variance: 3.6803285
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
Value              Count  Frequency (%)
5                  78513  30.1%
6                  46477  17.8%
4                  37763  14.5%
7                  35465  13.6%
3                  25957  10.0%
8                  13902  5.3%
2                  9305   3.6%
9                  5376   2.1%
10                 4492   1.7%
11                 917    0.4%
Other values (17)  2434   0.9%

Value  Count  Frequency (%)
2      9305   3.6%
3      25957  10.0%
4      37763  14.5%
5      78513  30.1%
6      46477  17.8%
7      35465  13.6%
8      13902  5.3%
9      5376   2.1%
10     4492   1.7%
11     917    0.4%

Value  Count  Frequency (%)
32     75     < 0.1%
31     1      < 0.1%
28     2      < 0.1%
26     2      < 0.1%
25     3      < 0.1%
24     4      < 0.1%
23     11     < 0.1%
21     13     < 0.1%
20     33     < 0.1%
19     7      < 0.1%

land_surface_condition
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
t      216757
n      35528
o      8316

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: t
2nd row: o
3rd row: t
4th row: t
5th row: t

Common Values

Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Most occurring characters

Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  260601  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Most occurring scripts

Value  Count   Frequency (%)
Latin  260601  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

foundation_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
r      219196
w      15118
u      14260
i      10579
h      1448

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: r
2nd row: r
3rd row: r
4th row: r
5th row: r

Common Values

Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%
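
The Alerts section flags foundation_type as highly imbalanced (60.9%). One common way to score class imbalance is one minus the Shannon entropy of the class frequencies, normalised by the logarithm of the number of classes, so that 0 means perfectly balanced and values near 1 mean one class dominates. Whether the report uses exactly this formula is an assumption; the sketch below applies it to the counts above and gives a value close to the reported figure. The function name `imbalance_score` is illustrative.

```python
import math

# Value -> count, copied from the foundation_type Common Values table.
counts = {"r": 219196, "w": 15118, "u": 14260, "i": 10579, "h": 1448}

def imbalance_score(counts):
    """One minus normalised Shannon entropy (assumed definition)."""
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    if len(probs) < 2:
        # A single class has no entropy to normalise; this sketch returns 0.
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(probs))

score = imbalance_score(counts)
print(f"{score:.1%}")
```

A perfectly even split, such as `imbalance_score({"a": 50, "b": 50})`, scores 0 under this definition.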

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

Most occurring characters

Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  260601  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

Most occurring scripts

Value  Count   Frequency (%)
Latin  260601  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

roof_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
n      182842
q      61576
x      16183

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: n
2nd row: n
3rd row: n
4th row: n
5th row: n

Common Values

Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Most occurring characters

Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  260601  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Most occurring scripts

Value  Count   Frequency (%)
Latin  260601  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

ground_floor_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
f      209619
x      24877
v      24593
z      1004
m      508

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: f
2nd row: x
3rd row: f
4th row: f
5th row: f

Common Values

Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Most occurring characters

Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  260601  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Most occurring scripts

Value  Count   Frequency (%)
Latin  260601  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

other_floor_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
q      165282
x      43448
j      39843
s      12028

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: q
2nd row: q
3rd row: x
4th row: x
5th row: x

Common Values

Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Most occurring characters

Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  260601  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Most occurring scripts

Value  Count   Frequency (%)
Latin  260601  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

position
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
s      202090
t      42896
j      13282
o      2333

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: t
2nd row: s
3rd row: t
4th row: s
5th row: s

Common Values

Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Most occurring characters

Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  260601  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Most occurring scripts

Value  Count   Frequency (%)
Latin  260601  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

plan_configuration
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value             Count
d                 250072
q                 5692
u                 3649
s                 346
c                 325
Other values (5)  517

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: d
2nd row: d
3rd row: d
4th row: d
5th row: d

Common Values

Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Most occurring characters

Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  260601  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Most occurring scripts

Value  Count   Frequency (%)
Latin  260601  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

has_superstructure_adobe_mud
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
0      237500
1      23101

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

Most occurring characters

Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  260601  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

Most occurring scripts

Value   Count   Frequency (%)
Common  260601  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

has_superstructure_mud_mortar_stone
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
1      198561
0      62040

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

Most occurring characters

Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  260601  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

Most occurring scripts

Value   Count   Frequency (%)
Common  260601  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

has_superstructure_stone_flag
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
0      251654
1      8947

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

Most occurring characters

Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  260601  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

Most occurring scripts

Value   Count   Frequency (%)
Common  260601  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

has_superstructure_cement_mortar_stone
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 MiB

Value  Count
0      255849
1      4752

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 260601
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

Most occurring characters

Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  260601  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

Most occurring scripts

Value   Count   Frequency (%)
Common  260601  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
242840 
1
 
17761

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Most occurring characters

ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 242840
93.2%
1 17761
 
6.8%

has_superstructure_cement_mortar_brick
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
240986 
1
 
19615

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Most occurring characters

ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 240986
92.5%
1 19615
 
7.5%
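
The IMBALANCE flag on has_superstructure_cement_mortar_brick reflects how lopsided the two counts above are (240986 zeros against 19615 ones). The report does not say how the score behind the flag is computed. A minimal sketch, assuming an entropy-based score of 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes, might look like this. The function name imbalance_score is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 when one class dominates.

    Assumption: this entropy-based definition is illustrative only; the report
    does not state the formula or threshold behind its IMBALANCE alerts.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts copied from the has_superstructure_cement_mortar_brick table above.
score = imbalance_score([240986, 19615])
print(f"imbalance score: {score:.3f}")
```

Under this definition a 50/50 split scores 0, and the 92.5% / 7.5% split above scores a little over 0.6.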
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
194151 
1
66450 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Most occurring characters

ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 194151
74.5%
1 66450
 
25.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
238447 
1
 
22154

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Most occurring characters

ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 238447
91.5%
1 22154
 
8.5%

has_superstructure_rc_non_engineered
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
249502 
1
 
11099

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Most occurring characters

ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 249502
95.7%
1 11099
 
4.3%

has_superstructure_rc_engineered
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
256468 
1
 
4133

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Most occurring characters

ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 256468
98.4%
1 4133
 
1.6%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
256696 
1
 
3905

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Most occurring characters

ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 256696
98.5%
1 3905
 
1.5%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
v
250939 
a
 
5512
w
 
2677
r
 
1473

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
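
As a rough illustration of the sentence above, Python's standard unicodedata module exposes the general category of a code point. The sketch below counts categories for the four values seen in this variable (v, a, w, r) and, for contrast, the digits 0 and 1 used by the binary columns. Script and block properties are not available in the standard library and are left out here. The helper name category_counts is hypothetical and is not taken from the tool that produced this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


letters = category_counts(["v", "a", "w", "r"])
digits = category_counts(["0", "1"])

# "Ll" is the general category Lowercase Letter, "Nd" is Decimal Number.
print(letters)
print(digits)
```

Every value in a column falling into a single category, as here, is what the "Distinct categories 1" figure summarises.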

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowv
2nd rowv
3rd rowv
4th rowv
5th rowv

Common Values

ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Most occurring characters

ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 260601
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 260601
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
v 250939
96.3%
a 5512
 
2.1%
w 2677
 
1.0%
r 1473
 
0.6%

count_families
Real number (ℝ)

Distinct        10
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            0.98394864
Minimum         0
Maximum         9
Zeros           20862
Zeros (%)       8.0%
Negative        0
Negative (%)    0.0%
Memory size     4.0 MiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           1
median                       1
Q3                           1
95-th percentile             2
Maximum                      9
Range                        9
Interquartile range (IQR)    0

Descriptive statistics

Standard deviation                 0.41838898
Coefficient of variation (CV)      0.42521424
Kurtosis                           17.670943
Mean                               0.98394864
Median Absolute Deviation (MAD)    0
Skewness                           1.6347579
Sum                                256418
Variance                           0.17504934
Monotonicity                       Not monotonic

Histogram with fixed size bins (bins=10)

Value  Count   Frequency (%)
1      226115  86.8%
0      20862   8.0%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%
5      104     < 0.1%
6      22      < 0.1%
7      7       < 0.1%
9      4       < 0.1%
8      2       < 0.1%

Value  Count   Frequency (%)
0      20862   8.0%
1      226115  86.8%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%
5      104     < 0.1%
6      22      < 0.1%
7      7       < 0.1%
8      2       < 0.1%
9      4       < 0.1%

Value  Count   Frequency (%)
9      4       < 0.1%
8      2       < 0.1%
7      7       < 0.1%
6      22      < 0.1%
5      104     < 0.1%
4      389     0.1%
3      1802    0.7%
2      11294   4.3%
1      226115  86.8%
0      20862   8.0%
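
The summary figures reported for count_families can be cross-checked from its value counts alone. The sketch below recomputes the sum, mean, variance, standard deviation, and coefficient of variation. It assumes the report uses the sample variance (dividing by n - 1), which is an assumption rather than something the report states. All variable names are illustrative.

```python
import math

# Value counts copied from the count_families frequency table above.
value_counts = {
    0: 20862, 1: 226115, 2: 11294, 3: 1802, 4: 389,
    5: 104, 6: 22, 7: 7, 8: 2, 9: 4,
}

n = sum(value_counts.values())                             # number of observations
total = sum(v * c for v, c in value_counts.items())        # reported as "Sum"
mean = total / n

# Assumption: sample variance with n - 1 in the denominator.
squared_dev = sum(c * (v - mean) ** 2 for v, c in value_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                            # coefficient of variation

print(f"n={n} sum={total} mean={mean:.8f}")
print(f"variance={variance:.8f} std={std:.8f} cv={cv:.8f}")
```

With these counts the sum is 256418 over 260601 rows, which matches the reported Sum and the number of observations. The remaining figures should agree with the reported Mean, Variance, Standard deviation, and CV up to rounding.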
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
231445 
1
29156 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Most occurring characters

ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 231445
88.8%
1 29156
 
11.2%

has_secondary_use_agriculture
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
243824 
1
 
16777

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Most occurring characters

ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 243824
93.6%
1 16777
 
6.4%

has_secondary_use_hotel
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
251838 
1
 
8763

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Most occurring characters

ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 251838
96.6%
1 8763
 
3.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
258490 
1
 
2111

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Most occurring characters

ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 258490
99.2%
1 2111
 
0.8%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
260356 
1
 
245

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260356
99.9%
1 245
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
260507 
1
 
94

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260507
> 99.9%
1 94
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
260322 
1
 
279

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260322
99.9%
1 279
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
260552 
1
 
49

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260552
> 99.9%
1 49
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
260563 
1
 
38

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260563
> 99.9%
1 38
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
260578 
1
 
23

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 260578
> 99.9%
1 23
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.0 MiB
0
259267 
1
 
1334

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Most occurring characters

ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 260601
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common 260601
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 260601
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 259267
99.5%
1 1334
 
0.5%

damage_grade
Categorical

Distinct        3
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     4.0 MiB

Value  Count
2      148259
3      87218
1      25124

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters260601
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row3
2nd row2
3rd row3
4th row2
5th row3

Common Values

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Most occurring characters

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  260601  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Most occurring scripts

Value   Count   Frequency (%)
Common  260601  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Interactions

[81 interaction plots (Matplotlib v3.5.3, image/svg+xml)]

Correlations

building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage count_families land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
building_id 1.000 -0.003 0.000 -0.000 0.000 0.000 -0.002 0.000 -0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.003 0.000 0.000 0.000 0.000 0.004 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.003 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.003 0.002
geo_level_1_id -0.003 1.000 -0.067 0.004 -0.088 -0.061 0.038 -0.077 0.037 0.033 0.199 0.206 0.110 0.129 0.136 0.031 0.269 0.333 0.113 0.066 0.285 0.215 0.207 0.235 0.045 0.062 0.060 0.089 0.098 0.126 0.028 0.046 0.006 0.007 0.010 0.004 0.006 0.004 0.044 0.270
geo_level_2_id 0.000 -0.067 1.000 0.001 0.043 0.035 -0.024 0.037 -0.012 0.049 0.098 0.087 0.069 0.071 0.079 0.018 0.078 0.151 0.063 0.026 0.100 0.125 0.089 0.092 0.063 0.068 0.041 0.064 0.016 0.036 0.030 0.043 0.008 0.004 0.007 0.000 0.003 0.006 0.030 0.069
geo_level_3_id -0.000 0.004 0.001 1.000 -0.016 -0.003 0.000 -0.018 -0.002 0.033 0.033 0.035 0.026 0.029 0.027 0.008 0.038 0.037 0.035 0.019 0.054 0.034 0.050 0.037 0.031 0.036 0.026 0.033 0.015 0.023 0.012 0.016 0.008 0.006 0.005 0.003 0.002 0.000 0.015 0.027
count_floors_pre_eq 0.000 -0.088 0.043 -0.016 1.000 0.255 0.125 0.755 0.078 0.047 0.144 0.182 0.123 0.582 0.314 0.036 0.204 0.358 0.050 0.030 0.392 0.258 0.099 0.080 0.107 0.131 0.033 0.067 0.074 0.055 0.141 0.066 0.026 0.024 0.022 0.008 0.011 0.003 0.010 0.154
age 0.000 -0.061 0.035 -0.003 0.255 1.000 -0.017 0.197 0.047 0.018 0.035 0.018 0.031 0.021 0.114 0.015 0.090 0.072 0.026 0.005 0.150 0.004 0.015 0.012 0.010 0.011 0.003 0.013 0.007 0.013 0.005 0.008 0.000 0.000 0.000 0.000 0.000 0.002 0.006 0.020
area_percentage -0.002 0.038 -0.024 0.000 0.125 -0.017 1.000 0.210 0.078 0.019 0.166 0.251 0.164 0.187 0.048 0.047 0.037 0.238 0.005 0.075 0.064 0.213 0.057 0.030 0.189 0.223 0.012 0.012 0.117 0.020 0.158 0.106 0.056 0.077 0.019 0.015 0.036 0.000 0.015 0.101
height_percentage 0.000 -0.077 0.037 -0.018 0.755 0.197 0.210 1.000 0.063 0.019 0.167 0.235 0.118 0.299 0.211 0.020 0.152 0.263 0.023 0.032 0.255 0.144 0.066 0.062 0.174 0.231 0.011 0.038 0.128 0.058 0.197 0.106 0.050 0.041 0.014 0.018 0.021 0.000 0.019 0.094
count_families -0.001 0.037 -0.012 -0.002 0.078 0.047 0.078 0.063 1.000 0.014 0.055 0.080 0.047 0.071 0.033 0.008 0.034 0.063 0.009 0.011 0.036 0.049 0.035 0.030 0.060 0.065 0.006 0.011 0.114 0.051 0.080 0.094 0.033 0.029 0.029 0.012 0.017 0.019 0.019 0.061
land_surface_condition 0.000 0.033 0.049 0.033 0.047 0.018 0.019 0.019 0.014 1.000 0.032 0.039 0.045 0.037 0.033 0.021 0.021 0.080 0.046 0.013 0.064 0.060 0.047 0.032 0.012 0.028 0.035 0.021 0.009 0.006 0.013 0.008 0.004 0.004 0.000 0.000 0.000 0.002 0.017 0.029
foundation_type 0.000 0.199 0.098 0.033 0.144 0.035 0.166 0.167 0.055 0.032 1.000 0.548 0.356 0.411 0.098 0.057 0.105 0.553 0.144 0.203 0.076 0.511 0.341 0.303 0.506 0.543 0.115 0.150 0.173 0.055 0.256 0.185 0.068 0.038 0.022 0.019 0.026 0.008 0.015 0.305
roof_type 0.000 0.206 0.087 0.035 0.182 0.018 0.251 0.235 0.080 0.039 0.548 1.000 0.472 0.522 0.126 0.063 0.073 0.438 0.043 0.084 0.036 0.420 0.142 0.094 0.446 0.467 0.020 0.029 0.160 0.060 0.233 0.188 0.065 0.032 0.016 0.017 0.022 0.008 0.010 0.241
ground_floor_type 0.000 0.110 0.069 0.026 0.123 0.031 0.164 0.118 0.047 0.045 0.356 0.472 1.000 0.364 0.080 0.059 0.082 0.499 0.130 0.153 0.050 0.586 0.102 0.083 0.364 0.363 0.028 0.032 0.155 0.068 0.249 0.162 0.060 0.037 0.025 0.018 0.020 0.006 0.039 0.264
other_floor_type 0.000 0.129 0.071 0.029 0.582 0.021 0.187 0.299 0.071 0.037 0.411 0.522 0.364 1.000 0.113 0.064 0.091 0.451 0.129 0.097 0.038 0.443 0.161 0.070 0.388 0.420 0.037 0.065 0.182 0.065 0.261 0.196 0.075 0.044 0.023 0.019 0.026 0.005 0.021 0.246
position 0.000 0.136 0.079 0.027 0.314 0.114 0.048 0.211 0.033 0.033 0.098 0.126 0.080 0.113 1.000 0.027 0.194 0.283 0.021 0.029 0.350 0.119 0.054 0.056 0.089 0.095 0.000 0.030 0.118 0.034 0.200 0.060 0.015 0.007 0.012 0.010 0.008 0.000 0.008 0.045
plan_configuration 0.003 0.031 0.018 0.008 0.036 0.015 0.047 0.020 0.008 0.021 0.057 0.063 0.059 0.064 0.027 1.000 0.027 0.121 0.016 0.026 0.044 0.106 0.028 0.023 0.049 0.042 0.019 0.014 0.030 0.017 0.049 0.028 0.012 0.032 0.005 0.000 0.000 0.012 0.000 0.057
has_superstructure_adobe_mud 0.000 0.269 0.078 0.038 0.204 0.090 0.037 0.152 0.034 0.021 0.105 0.073 0.082 0.091 0.194 0.027 1.000 0.307 0.007 0.014 0.315 0.037 0.012 0.011 0.037 0.037 0.057 0.051 0.013 0.003 0.012 0.003 0.004 0.000 0.000 0.002 0.001 0.000 0.010 0.075
has_superstructure_mud_mortar_stone 0.000 0.333 0.151 0.037 0.358 0.072 0.238 0.263 0.063 0.080 0.553 0.438 0.499 0.451 0.283 0.121 0.307 1.000 0.034 0.104 0.376 0.471 0.040 0.055 0.222 0.224 0.042 0.147 0.087 0.058 0.159 0.118 0.036 0.023 0.025 0.008 0.011 0.002 0.005 0.335
has_superstructure_stone_flag 0.000 0.113 0.063 0.035 0.050 0.026 0.005 0.023 0.009 0.046 0.144 0.043 0.130 0.129 0.021 0.016 0.007 0.034 1.000 0.037 0.033 0.044 0.125 0.078 0.008 0.021 0.066 0.010 0.000 0.010 0.009 0.011 0.000 0.000 0.003 0.000 0.001 0.000 0.000 0.066
has_superstructure_cement_mortar_stone 0.000 0.066 0.026 0.019 0.030 0.005 0.075 0.032 0.011 0.013 0.203 0.084 0.153 0.097 0.029 0.026 0.014 0.104 0.037 1.000 0.000 0.079 0.014 0.003 0.076 0.025 0.012 0.012 0.042 0.016 0.072 0.034 0.007 0.005 0.006 0.003 0.011 0.003 0.014 0.060
has_superstructure_mud_mortar_brick 0.004 0.285 0.100 0.054 0.392 0.150 0.064 0.255 0.036 0.064 0.076 0.036 0.050 0.038 0.350 0.044 0.315 0.376 0.033 0.000 1.000 0.031 0.000 0.000 0.029 0.026 0.026 0.035 0.010 0.039 0.025 0.018 0.001 0.000 0.011 0.000 0.002 0.000 0.004 0.064
has_superstructure_cement_mortar_brick 0.000 0.215 0.125 0.034 0.258 0.004 0.213 0.144 0.049 0.060 0.511 0.420 0.586 0.443 0.119 0.106 0.037 0.471 0.044 0.079 0.031 1.000 0.059 0.055 0.139 0.121 0.006 0.078 0.077 0.054 0.139 0.109 0.032 0.019 0.026 0.008 0.007 0.004 0.000 0.280
has_superstructure_timber 0.000 0.207 0.089 0.050 0.099 0.015 0.057 0.066 0.035 0.047 0.341 0.142 0.102 0.161 0.054 0.028 0.012 0.040 0.125 0.014 0.000 0.059 1.000 0.438 0.027 0.069 0.104 0.105 0.023 0.003 0.028 0.026 0.005 0.003 0.000 0.004 0.003 0.001 0.014 0.071
has_superstructure_bamboo 0.000 0.235 0.092 0.037 0.080 0.012 0.030 0.062 0.030 0.032 0.303 0.094 0.083 0.070 0.056 0.023 0.011 0.055 0.078 0.003 0.000 0.055 0.438 1.000 0.020 0.037 0.117 0.087 0.022 0.004 0.031 0.019 0.004 0.003 0.002 0.003 0.000 0.001 0.008 0.064
has_superstructure_rc_non_engineered 0.000 0.045 0.063 0.031 0.107 0.010 0.189 0.174 0.060 0.012 0.506 0.446 0.364 0.388 0.089 0.049 0.037 0.222 0.008 0.076 0.029 0.139 0.027 0.020 1.000 0.012 0.018 0.008 0.108 0.023 0.158 0.103 0.036 0.020 0.015 0.007 0.004 0.000 0.000 0.187
has_superstructure_rc_engineered 0.000 0.062 0.068 0.036 0.131 0.011 0.223 0.231 0.065 0.028 0.543 0.467 0.363 0.420 0.095 0.042 0.037 0.224 0.021 0.025 0.026 0.121 0.069 0.037 0.012 1.000 0.010 0.013 0.104 0.029 0.140 0.131 0.050 0.024 0.004 0.010 0.030 0.003 0.008 0.237
has_superstructure_other 0.000 0.060 0.041 0.026 0.033 0.003 0.012 0.011 0.006 0.035 0.115 0.020 0.028 0.037 0.000 0.019 0.057 0.042 0.066 0.012 0.026 0.006 0.104 0.117 0.018 0.010 1.000 0.020 0.000 0.006 0.007 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.006 0.033
legal_ownership_status 0.000 0.089 0.064 0.033 0.067 0.013 0.012 0.038 0.011 0.021 0.150 0.029 0.032 0.065 0.030 0.014 0.051 0.147 0.010 0.012 0.035 0.078 0.105 0.087 0.008 0.013 0.020 1.000 0.025 0.012 0.042 0.005 0.000 0.008 0.004 0.000 0.000 0.000 0.016 0.070
has_secondary_use 0.000 0.098 0.016 0.015 0.074 0.007 0.117 0.128 0.114 0.009 0.173 0.160 0.155 0.182 0.118 0.030 0.013 0.087 0.000 0.042 0.010 0.077 0.023 0.022 0.108 0.104 0.000 0.025 1.000 0.739 0.526 0.255 0.086 0.053 0.092 0.038 0.033 0.026 0.202 0.080
has_secondary_use_agriculture 0.000 0.126 0.036 0.023 0.055 0.013 0.020 0.058 0.051 0.006 0.055 0.060 0.068 0.065 0.034 0.017 0.003 0.058 0.010 0.016 0.039 0.054 0.003 0.004 0.023 0.029 0.006 0.012 0.739 1.000 0.049 0.024 0.008 0.004 0.008 0.002 0.002 0.000 0.085 0.047
has_secondary_use_hotel 0.003 0.028 0.030 0.012 0.141 0.005 0.158 0.197 0.080 0.013 0.256 0.233 0.249 0.261 0.200 0.049 0.012 0.159 0.009 0.072 0.025 0.139 0.028 0.031 0.158 0.140 0.007 0.042 0.526 0.049 1.000 0.017 0.005 0.002 0.005 0.000 0.000 0.000 0.003 0.108
has_secondary_use_rental 0.000 0.046 0.043 0.016 0.066 0.008 0.106 0.106 0.094 0.008 0.185 0.188 0.162 0.196 0.060 0.028 0.003 0.118 0.011 0.034 0.018 0.109 0.026 0.019 0.103 0.131 0.000 0.005 0.255 0.024 0.017 1.000 0.001 0.000 0.001 0.000 0.000 0.000 0.001 0.101
has_secondary_use_institution 0.000 0.006 0.008 0.008 0.026 0.000 0.056 0.050 0.033 0.004 0.068 0.065 0.060 0.075 0.015 0.012 0.004 0.036 0.000 0.007 0.001 0.032 0.005 0.004 0.036 0.050 0.001 0.000 0.086 0.008 0.005 0.001 1.000 0.000 0.000 0.000 0.000 0.000 0.003 0.033
has_secondary_use_school 0.000 0.007 0.004 0.006 0.024 0.000 0.077 0.041 0.029 0.004 0.038 0.032 0.037 0.044 0.007 0.032 0.000 0.023 0.000 0.005 0.000 0.019 0.003 0.003 0.020 0.024 0.000 0.008 0.053 0.004 0.002 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.014
has_secondary_use_industry 0.000 0.010 0.007 0.005 0.022 0.000 0.019 0.014 0.029 0.000 0.022 0.016 0.025 0.023 0.012 0.005 0.000 0.025 0.003 0.006 0.011 0.026 0.000 0.002 0.015 0.004 0.000 0.004 0.092 0.008 0.005 0.001 0.000 0.000 1.000 0.000 0.000 0.000 0.003 0.013
has_secondary_use_health_post 0.000 0.004 0.000 0.003 0.008 0.000 0.015 0.018 0.012 0.000 0.019 0.017 0.018 0.019 0.010 0.000 0.002 0.008 0.000 0.003 0.000 0.008 0.004 0.003 0.007 0.010 0.000 0.000 0.038 0.002 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.008
has_secondary_use_gov_office 0.000 0.006 0.003 0.002 0.011 0.000 0.036 0.021 0.017 0.000 0.026 0.022 0.020 0.026 0.008 0.000 0.001 0.011 0.001 0.011 0.002 0.007 0.003 0.000 0.004 0.030 0.000 0.000 0.033 0.002 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.010
has_secondary_use_use_police 0.000 0.004 0.006 0.000 0.003 0.002 0.000 0.000 0.019 0.002 0.008 0.008 0.006 0.005 0.000 0.012 0.000 0.002 0.000 0.003 0.000 0.004 0.001 0.001 0.000 0.003 0.000 0.000 0.026 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000
has_secondary_use_other 0.003 0.044 0.030 0.015 0.010 0.006 0.015 0.019 0.019 0.017 0.015 0.010 0.039 0.021 0.008 0.000 0.010 0.005 0.000 0.014 0.004 0.000 0.014 0.008 0.000 0.008 0.006 0.016 0.202 0.085 0.003 0.001 0.003 0.000 0.003 0.000 0.000 0.000 1.000 0.016
damage_grade 0.002 0.270 0.069 0.027 0.154 0.020 0.101 0.094 0.061 0.029 0.305 0.241 0.264 0.246 0.045 0.057 0.075 0.335 0.066 0.060 0.064 0.280 0.071 0.064 0.187 0.237 0.033 0.070 0.080 0.047 0.108 0.101 0.033 0.014 0.013 0.008 0.010 0.000 0.016 1.000
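The "High correlation" alerts in the overview can be related to this matrix by filtering for large coefficients. The sketch below uses a handful of values copied from the matrix and a cutoff of 0.5; that cutoff is an assumption inferred from which pairs are flagged, not a setting stated in the report, and `high_correlation_pairs` is a hypothetical helper.

```python
# A few coefficients copied from the correlation matrix (most pairs omitted)
corr = {
    ("count_floors_pre_eq", "height_percentage"): 0.755,
    ("count_floors_pre_eq", "other_floor_type"): 0.582,
    ("count_floors_pre_eq", "age"): 0.255,
    ("foundation_type", "roof_type"): 0.548,
    ("has_secondary_use", "has_secondary_use_agriculture"): 0.739,
    ("has_secondary_use", "has_secondary_use_rental"): 0.255,
}

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def high_correlation_pairs(pairs, threshold):
    """Return (a, b, value) for pairs with |value| >= threshold, strongest first."""
    selected = [(a, b, v) for (a, b), v in pairs.items() if abs(v) >= threshold]
    return sorted(selected, key=lambda item: abs(item[2]), reverse=True)


high = high_correlation_pairs(corr, THRESHOLD)
for a, b, v in high:
    print(f"{a} ~ {b}: {v:.3f}")
```

With these inputs, the two pairs at 0.255 are dropped and the remaining four are listed from strongest to weakest.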

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
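The per-column nullity count behind such plots can be sketched in plain Python. The rows below are hypothetical toy data chosen to contain gaps (the profiled dataset itself reports 0 missing cells), and `missing_by_column` is an illustrative helper, not a function from the profiling tool.

```python
def missing_by_column(rows):
    """Return {column: number of rows where the value is None}."""
    columns = rows[0].keys() if rows else []
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


# Hypothetical rows with some missing values, for illustration only
rows = [
    {"age": 30, "roof_type": "n"},
    {"age": None, "roof_type": "n"},
    {"age": 10, "roof_type": None},
    {"age": None, "roof_type": "q"},
]

nullity = missing_by_column(rows)
print(nullity)  # {'age': 2, 'roof_type': 1}
```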

Sample

building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
080290664871219823065trnfqtd11000000000v1000000000003
1288308900281221087ornxqsd01000000000v1000000000002
29494721363897321055trnfxtd01000000000v1000000000003
3590882224181069421065trnfxsd01000011000v1000000000002
420194411131148833089trnfxsd10000000000v1000000000003
53330208558608921095trnfqsd01000000000v1110000000002
672845194751206622534nrnxqsd01000000000v1000000000003
747551520323122362086twqvxsu00000110000v1000000000001
84411260757721921586trqfqsd01000010000v1000000000002
99895002688699410134tinvjsd00000100000v1000000000001
building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
26059156080520368598012553nrnfjsd01000000000v1110000000003
260592207683101382190322555trnfqsd01000010000v1000000000002
2605932264218767861325135trnfqsd01000000000v1110000000002
260594159555271811537601312trnfxjd00001000000v1000000000002
2605958270128268471822085trnfqsd01000000000v1000000000003
260596688636251335162115563nrnfjsq01000000000v1000000000002
2605976694851771520602065trnfqsd01000000000v1000000000003
2605986025121751816335567trqfqsd01000000000v1000000000003
26059915140926391851210146trxvsjd00000100000v1000000000002
260600747594219910131076nrnfqjd01000000000v3000000000003